deep learning pipeline
Automated Multi-label Classification of Eleven Retinal Diseases: A Benchmark of Modern Architectures and a Meta-Ensemble on a Large Synthetic Dataset
Cao-Xue, Jerry, Comlekoglu, Tien, Xue, Keyi, Wang, Guanliang, Li, Jiang, Laurie, Gordon
The development of multi-label deep learning models for retinal disease classification is often hindered by the scarcity of large, expertly annotated clinical datasets, a consequence of patient privacy concerns and high annotation costs. The recent release of SynFundus-1M, a high-fidelity synthetic dataset of over one million fundus images, presents a novel opportunity to overcome these barriers. To establish a foundational performance benchmark for this new resource, we developed an end-to-end deep learning pipeline, training six modern architectures (ConvNeXtV2, SwinV2, ViT, ResNet, EfficientNetV2, and the RETFound foundation model) to classify eleven retinal diseases using a 5-fold multi-label stratified cross-validation strategy. We further developed a meta-ensemble model by stacking the out-of-fold predictions with an XGBoost classifier. Our final ensemble model achieved the highest performance on the internal validation set, with a macro-average Area Under the Receiver Operating Characteristic Curve (AUC) of 0.9973. Critically, the models demonstrated strong generalization to three diverse, real-world clinical datasets, achieving an AUC of 0.7972 on a combined diabetic retinopathy (DR) dataset, an AUC of 0.9126 on the AIROGS glaucoma dataset, and a macro-AUC of 0.8800 on the multi-label RFMiD dataset. This work provides a robust baseline for future research on large-scale synthetic datasets and establishes that models trained exclusively on synthetic data can accurately classify multiple pathologies and generalize effectively to real clinical images, offering a viable pathway to accelerate the development of comprehensive AI systems in ophthalmology.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- Asia > China > Chongqing Province > Chongqing (0.04)
- North America > United States > Virginia > Norfolk City County > Norfolk (0.04)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.68)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
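The meta-ensemble described in the abstract above stacks out-of-fold (OOF) predictions from cross-validated base models. As a minimal sketch of that mechanism only — with toy placeholder learners standing in for the actual ConvNeXtV2/SwinV2/etc. networks and the XGBoost meta-learner, and plain contiguous folds instead of multi-label stratification:

```python
# Sketch of out-of-fold stacking: each sample's prediction comes from a model
# trained on the other folds, so the meta-learner never sees leaked labels.
# Base learner here is a toy mean predictor, not the paper's actual models.

def kfold_indices(n, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val

def oof_predictions(X, y, fit, predict, k=5):
    """Fill each sample's prediction using a model trained on the other folds."""
    oof = [None] * len(X)
    for train, val in kfold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in val:
            oof[i] = predict(model, X[i])
    return oof

# Toy base learner: predicts the training-set positive rate for every sample.
fit = lambda X, y: sum(y) / len(y)
predict = lambda model, x: model

X = list(range(10))
y = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
oof = oof_predictions(X, y, fit, predict)
```

In the paper's setting, the OOF probability vectors from all six architectures would be concatenated per image and fed to the XGBoost meta-classifier as features.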
Deep Learning Pipeline for Fully Automated Myocardial Infarct Segmentation from Clinical Cardiac MR Scans
Schwab, Matthias, Pamminger, Mathias, Kremser, Christian, Mayr, Agnes
Purpose: To develop and evaluate a deep learning-based method for fully automated myocardial infarct segmentation. Materials and Methods: For this retrospective study, a cascaded framework of two- and three-dimensional convolutional neural networks (CNNs), specialized in identifying ischemic myocardial scars on late gadolinium enhancement (LGE) cardiac magnetic resonance (CMR) images, was trained on an in-house dataset of 144 examinations. On a separate test dataset from the same institution, comprising images from 152 examinations obtained between 2021 and 2023, artificial intelligence (AI)-based segmentations were compared quantitatively with manual segmentations. Further, segmentation accuracy was assessed qualitatively for both human- and AI-generated contours by two CMR experts in a blinded experiment. Results: Excellent agreement was found between manually and automatically calculated infarct volumes ($\rho_c$ = 0.9). In the qualitative evaluation, the experts rated the AI-based segmentations as better representing the actual extent of infarction significantly more often than the human-based measurements (p < 0.001; 33.4% AI, 25.1% human, 41.5% equal). By contrast, for segmentation of microvascular obstruction (MVO), manual measurements were still preferred (11.3% AI, 55.6% human, 33.1% equal). Conclusion: This fully automated segmentation pipeline calculates CMR infarct size in a very short time, requires no pre-processing of the input images, and matches the segmentation quality of trained human observers. In a blinded experiment, experts preferred automated infarct segmentations more often than manual ones, paving the way for potential clinical application.
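The agreement statistic reported above ($\rho_c$ = 0.9) is a concordance correlation. A minimal pure-Python implementation of Lin's concordance correlation coefficient for two lists of infarct volumes (the example volumes are made up for illustration):

```python
# Lin's concordance correlation coefficient (CCC): measures how well paired
# measurements agree with the identity line, penalizing both poor correlation
# and systematic offset between the two raters/methods.

def concordance_ccc(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Population (biased) variances and covariance, as in Lin's original CCC.
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)

manual = [12.0, 30.5, 8.2, 22.1]   # hypothetical manual infarct volumes (ml)
auto   = [11.5, 31.0, 9.0, 21.0]   # hypothetical AI-derived volumes (ml)
ccc = concordance_ccc(manual, auto)
```

A CCC of 1.0 means perfect agreement; values near 0.9, as in the study, indicate the automated volumes track the manual ones closely in both correlation and scale.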
A deep learning pipeline for controlling protein interactions
One of the LPDI's de novo protein binders (red) bound to the protein Bcl2 (blue) in complex with the FDA-approved drug Venetoclax (beige). LPDI EPFL. In 2023, scientists in the joint School of Engineering and School of Life Sciences Laboratory of Protein Design and Immunoengineering (LPDI), led by Bruno Correia, published a deep-learning pipeline for designing new proteins that interact with therapeutic targets. Their tool, MaSIF, can rapidly scan millions of proteins to identify optimal matches between molecules based on their chemical and geometric surface properties, enabling scientists to engineer novel protein-protein interactions that play key roles in cell regulation and therapeutics. A year and a half later, the team has reported an exciting advancement of this technology: they have used MaSIF to design novel protein binders that interact with known protein complexes involving small molecules such as therapeutic drugs or hormones. Because these bound small molecules induce subtle changes in the surface properties ('neosurfaces') of the protein-drug complexes, they can act as 'on' or 'off' switches for fine control of cellular functions like DNA transcription or protein degradation.
Harnessing multiple LLMs for Information Retrieval: A case study on Deep Learning methodologies in Biodiversity publications
Kommineni, Vamsi Krishna, König-Ries, Birgitta, Samuel, Sheeba
Deep Learning (DL) techniques are increasingly applied in scientific studies across various domains to address complex research questions. However, the methodological details of these DL models are often hidden in unstructured text. As a result, critical information about how these models are designed, trained, and evaluated is challenging to access and comprehend. To address this issue, we use five different open-source Large Language Models (LLMs): Llama-3 70B, Llama-3.1 70B, Mixtral-8x22B-Instruct-v0.1, Mixtral 8x7B, and Gemma 2 9B, in combination with a Retrieval-Augmented Generation (RAG) approach, to automatically extract and process DL methodological details from scientific publications. We built a voting classifier from the outputs of the five LLMs to report DL methodological information accurately. We tested our approach on biodiversity publications, building upon our previous research. To validate our pipeline, we employed two datasets of DL-related biodiversity publications: a curated set of 100 publications from our prior work and a set of 364 publications from the journal Ecological Informatics. Our results demonstrate that the multi-LLM, RAG-assisted pipeline enhances the retrieval of DL methodological information, achieving an accuracy of 69.5% (417 out of 600 comparisons) based solely on the textual content of publications. This performance was assessed against human annotators who had access to code, figures, tables, and other supplementary information. Although demonstrated on biodiversity literature, our methodology is not limited to this field; it can be applied across other scientific domains where detailed methodological reporting is essential for advancing knowledge and ensuring reproducibility. This study presents a scalable and reliable approach for automating information extraction, facilitating better reproducibility and knowledge transfer across studies.
- Europe > Germany > Saxony > Leipzig (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
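The voting-classifier step described in the abstract above combines five LLM outputs per question into one answer. A minimal sketch of majority voting — the answer normalization and tie-breaking here are simplifying assumptions, not the authors' exact rules:

```python
# Majority vote over multiple LLM answers to the same extraction question.
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer among the LLM outputs.

    Answers are normalized (lowercased, stripped) before counting; ties are
    broken by first occurrence among the normalized answers.
    """
    normalized = [a.strip().lower() for a in answers]
    counts = Counter(normalized)
    best = max(counts.values())
    for a in normalized:   # first answer reaching the top count wins the tie
        if counts[a] == best:
            return a

# Hypothetical outputs of the five LLMs for "which architecture was used?"
llm_outputs = ["ResNet-50", "resnet-50", "VGG16", "ResNet-50 ", "Inception"]
voted = majority_vote(llm_outputs)   # "resnet-50" wins with 3 of 5 votes
```

With five voters, an answer needs at most three agreeing models to win outright, which is what makes the ensemble more robust than any single LLM's output.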
Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models
Fan, Yewen, Si, Nian, Song, Xiangchen, Zhang, Kun
Deep learning has been widely adopted across various fields, but little attention has been paid to evaluating the performance of deep learning pipelines. With the increased use of large datasets and complex models, it has become common to run the training process only once and compare the result to previous benchmarks. However, this procedure can lead to imprecise comparisons because of the variance in neural network evaluation metrics, which stems from the randomness inherent in the training process. Traditional solutions, such as running the training process multiple times, are usually infeasible in deep learning due to computational limitations. In this paper, we propose a new metric framework, Calibrated Loss Metric, that reduces the variance of its vanilla counterpart. As a result, the new metric detects effective modeling improvements more accurately. Our approach is supported by theoretical justifications and extensive experimental validation in the context of deep click-through rate prediction models.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > France > Île-de-France (0.04)
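The abstract above does not spell out the exact form of the Calibrated Loss Metric, so the following is only an illustrative sketch of the general "calibration-then-calculation" idea: correct a run's systematic prediction bias on a calibration split before computing log loss on the evaluation split, so run-to-run bias shifts contribute less variance to the reported metric. The intercept-shift calibration used here is an assumption, not the paper's method.

```python
# Compute log loss after removing a run's systematic bias via a logit-space
# intercept shift estimated on a held-out calibration split.
import math

def log_loss(y, p):
    eps = 1e-12
    return -sum(yi * math.log(max(pi, eps)) + (1 - yi) * math.log(max(1 - pi, eps))
                for yi, pi in zip(y, p)) / len(y)

def calibrated_log_loss(y_cal, p_cal, y_eval, p_eval):
    logit = lambda q: math.log(q / (1 - q))
    sigmoid = lambda z: 1 / (1 + math.exp(-z))
    # Shift so the calibrated mean prediction matches the calibration label mean.
    shift = logit(sum(y_cal) / len(y_cal)) - logit(sum(p_cal) / len(p_cal))
    p_adj = [sigmoid(logit(p) + shift) for p in p_eval]
    return log_loss(y_eval, p_adj)

# A run that systematically over-predicts CTR (labels average 0.5, predictions 0.7):
y = [0, 1, 0, 1]
p = [0.7, 0.7, 0.7, 0.7]
raw = log_loss(y, p)
cal = calibrated_log_loss(y, p, y, p)   # bias removed before scoring
```

Because two training runs of the same model often differ mainly in such bias, scoring after calibration makes their metric values closer to each other, which is the variance reduction the paper targets.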
A deep learning pipeline for cross-sectional and longitudinal multiview data integration
Jain, Sarthak, Safo, Sandra E.
Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, presenting limitations. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. Additionally, it identifies key variables contributing to the association between views and the separation among classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks and recurrent neural networks. We applied this pipeline to cross-sectional and longitudinal multi-omics data (metagenomics, transcriptomics, and metabolomics) from an inflammatory bowel disease (IBD) study and we identified microbial pathways, metabolites, and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods. The proposed pipeline is available from the following GitHub repository: https://github.com/lasandrall/DeepIDA-GRU.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Norway > Norwegian Sea (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
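The pipeline described in the abstract above begins with variable selection/ranking using linear and nonlinear methods. As a minimal illustration of the linear case only — a simplified, assumed scheme that ranks variables by the absolute Pearson correlation between each variable and a binary class outcome:

```python
# Rank candidate variables (e.g., genes, metabolites) by |Pearson r| with a
# binary outcome such as IBD status. This is a toy linear ranking, not the
# pipeline's full selection machinery.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_variables(columns, y):
    """columns: dict name -> list of values; returns names sorted by |r| desc."""
    scores = {name: abs(pearson(col, y)) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

y = [0, 0, 1, 1]                          # hypothetical class outcome
cols = {
    "gene_a": [1.0, 1.1, 5.0, 5.2],       # strongly separates the classes
    "gene_b": [2.0, 4.0, 2.1, 4.2],       # mostly unrelated to y
}
ranking = rank_variables(cols, y)         # "gene_a" ranked first
```

The top-ranked variables would then feed into the feature-extraction stage (functional PCA, Euler characteristics) before joint integration and classification.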
KP-RNN: A Deep Learning Pipeline for Human Motion Prediction and Synthesis of Performance Art
Perrine, Patrick, Kirkby, Trevor
Digitally synthesizing human motion is an inherently complex process, which can create obstacles in application areas such as virtual reality. We offer a new approach for predicting human motion, KP-RNN, a neural network which can integrate easily with existing image processing and generation pipelines. We utilize a new human motion dataset of performance art, Take The Lead, as well as the motion generation pipeline, the Everybody Dance Now system, to demonstrate the effectiveness of KP-RNN's motion predictions. We have found that our neural network can predict human dance movements effectively, which serves as a baseline result for future works using the Take The Lead dataset. Since KP-RNN can work alongside a system such as Everybody Dance Now, we argue that our approach could inspire new methods for rendering human avatar animation. This work also serves to benefit the visualization of performance art in digital platforms by utilizing accessible neural networks.
NeXtQSM -- A complete deep learning pipeline for data-consistent quantitative susceptibility mapping trained with hybrid data
Cognolato, Francesco, O'Brien, Kieran, Jin, Jin, Robinson, Simon, Laun, Frederik B., Barth, Markus, Bollmann, Steffen
Deep learning-based Quantitative Susceptibility Mapping (QSM) has shown great potential in recent years, obtaining results similar to established non-learning approaches. However, many current deep learning approaches are not data-consistent, require in vivo training data, or solve the QSM problem in consecutive steps, resulting in the propagation of errors. Here we overcome these limitations with a framework that solves the QSM processing steps jointly. We developed a new hybrid training-data generation method that enables end-to-end training for solving background field correction and dipole inversion in a data-consistent fashion, using a variational network that combines the QSM model term with a learned regularizer. We demonstrate that NeXtQSM overcomes the limitations of previous deep learning methods, offering a deep learning-based pipeline for computing quantitative susceptibility maps that integrates each processing step into the training and produces results that are robust and fast.
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
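The variational network mentioned in the abstract above alternates a data-consistency (model) term with a learned regularizer. As a purely structural sketch of one unrolled gradient iteration — with a fixed smoothness penalty standing in for the learned regularizer and a trivial 1-D quadratic data term instead of the actual dipole-inversion model:

```python
# One unrolled iteration of a variational scheme: gradient step on
# 0.5*||x - b||^2 + lam * smoothness(x). In NeXtQSM the smoothness term
# would be a learned network and the data term the QSM dipole model.

def variational_step(x, b, tau=0.1, lam=0.5):
    n = len(x)
    grad_data = [xi - bi for xi, bi in zip(x, b)]          # data-consistency
    # Placeholder "regularizer" gradient: discrete Laplacian penalizing jumps.
    grad_reg = []
    for i in range(n):
        left = x[i - 1] if i > 0 else x[i]
        right = x[i + 1] if i < n - 1 else x[i]
        grad_reg.append(2 * x[i] - left - right)
    return [xi - tau * (gd + lam * gr)
            for xi, gd, gr in zip(x, grad_data, grad_reg)]

b = [0.0, 1.0, 0.0, 1.0]   # toy "measured" signal
x = [0.0, 0.0, 0.0, 0.0]   # initial susceptibility estimate
for _ in range(50):
    x = variational_step(x, b)
res = sum((xi - bi) ** 2 for xi, bi in zip(x, b))   # data residual shrinks
```

Training such a network end-to-end means the step sizes and the regularizer's parameters are learned jointly across all unrolled iterations, which is what keeps the pipeline data-consistent.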
Exploration of Deep Learning pipelines made easy
The easiest way to show how ATOM can help you is through an example. This story walks you through a notebook that trains and validates a Convolutional Neural Network implemented with Keras. The model is trained on MNIST¹, a well-known dataset of handwritten digit images to be classified into ten classes. We start with the necessary imports and defining the model: a simple neural network with two convolutional layers and one output layer consisting of 10 neurons, one for each digit. The dataset contains 28x28 grayscale images, so every image array has shape (28, 28, 1).
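A sketch of the model described above — exact filter counts and kernel sizes are assumptions, since the text only specifies two convolutional layers and a 10-neuron output:

```python
# Simple Keras CNN for 28x28x1 MNIST images: two convolutional layers,
# then a softmax output with one neuron per digit class.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.Conv2D(64, kernel_size=3, activation="relu"),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # one neuron per digit
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

`sparse_categorical_crossentropy` is used here so the integer digit labels can be fed in directly, without one-hot encoding.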
Developing a Deep Learning Pipeline for Classifying Cassava Leaf Diseases
After loading the data from the train and test folders and setting up our simple base model, we decided it would be worth the effort to convert the data to the TFRecords format. TFRecords is a binary storage format specifically designed to speed up the performance and training time of models built in TensorFlow; in essence, data in the TFRecords format is optimized for use with TensorFlow in various respects. Despite these advantages, getting the data into a format ready to feed into a model is not straightforward: it requires defining functions to read the files and decode the images they contain. It is also logical to augment the data (flip, randomly change brightness, add saturation, etc.) in this step, since the images will eventually be reshaped into arrays.
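The read/decode/augment functions described above can be sketched as follows — the feature keys (`"image"`, `"target"`), file pattern, and image size are assumptions about the TFRecord schema, not the competition's actual layout:

```python
# Build a tf.data input pipeline: parse TFRecords, decode and scale images,
# then apply the random augmentations mentioned above.
import tensorflow as tf

FEATURES = {
    "image": tf.io.FixedLenFeature([], tf.string),
    "target": tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    example = tf.io.parse_single_example(serialized, FEATURES)
    image = tf.io.decode_jpeg(example["image"], channels=3)
    image = tf.image.resize(image, [224, 224]) / 255.0   # float array in [0, 1]
    return image, example["target"]

def augment(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_saturation(image, 0.9, 1.1)
    return image, label

dataset = (tf.data.TFRecordDataset(tf.io.gfile.glob("train/*.tfrec"))
           .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
           .map(augment, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(32)
           .prefetch(tf.data.AUTOTUNE))
```

Keeping the augmentations inside the `tf.data` pipeline means they run on the fly during training, so each epoch sees slightly different versions of every image.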